A system to aid in the post-mortem debugging of
assembler language programs written for the IBM System/360-370 series
of computers is described in this master's thesis. A user's manual
for the system, with descriptions of user requirements, Job Control
Language (JCL) statements, and system output, comprises the first
chapter. The program logic structure is explained in the second
chapter. The third chapter discusses the termination analysis routine
logic for the interruption types handled by the system. Conclusions and
directions for future work are presented in the fourth chapter. Finally,
an appendix of sample programs, with a listing of all the JCL statements
needed to use the diagnostics system, is provided. (HDR)



ED 092 153




TECHNICAL REPORT SERIES 







COMPUTER & INFORMATION SCIENCE RESEARCH CENTER






THE OHIO STATE UNIVERSITY COLUMBUS, OHIO 



(OSU-CISRC-TR-74-3) 






AN IMPROVED ERROR DIAGNOSTICS SYSTEM 
FOR IBM SYSTEM/360 - 370 
ASSEMBLER PROGRAM DUMPS 

by 

Barry M. Kirsch 




Work performed in part under
Grant No. 534.1, National Science Foundation



U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED
EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING
IT. POINTS OF VIEW OR OPINIONS
STATED DO NOT NECESSARILY REPRESENT
OFFICIAL NATIONAL INSTITUTE OF
EDUCATION POSITION OR POLICY.



The Computer and Information Science Research Center 
The Ohio State University 
Columbus, Ohio 43210 

June 1974



PREFACE 



This work was done in partial fulfillment of the requirements 
for the Master of Science degree in Computer and Information Science 
from The Ohio State University. It was supported in part by Grant 
No. GN 534.1 from the Office of Science Information Service, the
National Science Foundation, to the Computer and Information
Science Research Center of The Ohio State University.

The Computer and Information Science Research Center of The
Ohio State University is an interdisciplinary research organization
which consists of the staff, graduate students, and faculty of many
University departments and laboratories. This report is based on
research accomplished in cooperation with the Department of Computer
and Information Science.

The research was administered and monitored by The Ohio State 
University Research Foundation. 






ACKNOWLEDGEMENTS 



I wish to express my thanks and appreciation to my adviser,
Dr. Robert F. Mathis, for his advice, assistance, and encouragement
in the preparation of this thesis.

Thanks are also due to Dr. Clinton R. Foulk for serving on 
my reading committee, and to Rick Baum for allowing me to use, 
essentially unmodified, his hexadecimal-to-decimal floating-point
register conversion routine. Computer time for the development
and testing of the system was provided by The Ohio State University
Instruction and Research Computer Center. I would also like to
acknowledge my support as a University Fellow by the Graduate School
at The Ohio State University.

I wish to thank my parents for their moral support and 
encouragement during my educational career, and indeed, throughout 
my life. Finally, special thanks are due to my wife Debbie for her 
support, patience, and many good-natured sacrifices during the 
preparation and completion of this work. 



TABLE OF CONTENTS

ACKNOWLEDGEMENTS

INTRODUCTION

CHAPTER I - USER'S MANUAL

1) General

2) User Requirements

3) Necessary JCL Statements

4) Output Description

CHAPTER II - PROGRAM LOGIC STRUCTURE

1) General

2) Description of INTERUPT

3) General Description of the Termination Analysis Routines

CHAPTER III - TERMINATION ANALYSIS ROUTINE LOGIC

1) General

2) 0C1 - Invalid Operation

3) 0C2 - Privileged Operation

4) 0C3 - Invalid EXecute

5) 0C4 - Protection

6) 0C5 - Addressing

7) 0C45 - Traces Instructions for 0C4 and 0C5 Interruptions

8) 0C6 - Specification

9) 0C7 - Data

10) 0C8 - Fixed-Point Overflow

11) 0C9 - Fixed-Point Divide

12) 0CA - Decimal Overflow

13) 0CB - Decimal Divide

14) 0CC - Exponent Overflow

15) 0CD - Exponent Underflow

16) 0CE - Significance

17) 0CF - Floating-Point Divide

CHAPTER IV - CONCLUSIONS AND DIRECTIONS FOR FUTURE WORK

1) Conclusions

2) Directions for Future Work

APPENDIX - SAMPLE PROGRAMS

BIBLIOGRAPHY




INTRODUCTION 



Post-mortem debugging of assembler language programs is, for the 
most part, a difficult, time-consuming task requiring in many cases 
detailed familiarity with the system being used. Most assembler language
programmers would surely benefit from improved post-mortem error
diagnostics, yet very little practical work has been done in this
area. With few exceptions, the standard manufacturer-supplied system
dump is all the assembler programmer has recourse to in determining
the reason for his program's abnormal termination. The system dump
generally contains all the information necessary for the user to
accurately deduce the cause of his error, but cloaks it among much
irrelevant data in an extremely hard-to-read format. It is only the
experienced programmer, knowledgeable in reading dumps and accustomed 
to the different sources of program errors, who does not have diffi- 
culty in poring over the pages upon pages of data in the typical sys- 
tem dump. Even he should benefit from a more compact, informative, 
and easier-to-read post-mortem diagnostic analysis. 

Described herein is an error diagnostics system for assembler 
language programs written for the IBM System/360 - 370 series of com- 
puters. At the programmer's option, any or all of the 15 program in- 
terruptions can be trapped and handled by the error diagnostics package 
instead of or in addition to the standard system dump. Although the 
diagnostics system is not able to determine the precise cause of every 
possible program error, its use should facilitate the process of
post-mortem program debugging.



The output produced by the diagnostics system is designed to provide
only that information which would be of the greatest value to the
user in debugging his program. No system control blocks or "raw" data
are printed. Instead, relevant data items are taken from the control
blocks, the program's core image, and data areas, and then
programmatically analyzed; only that information which will aid the user
most in the debugging process is printed, organized in a comprehensive,
informative format.

The system was written for and tested on the IBM System/370 Model 
165 computer operating under OS/MVT (operating system providing
multiprogramming with a variable number of tasks), at the Instruction
and Research Computer Center of The Ohio State University. Although
designed specifically for use with this configuration, the system
could be easily modified to run on other models of the IBM System/360- 
370 computer series operating under any of the IBM-supplied operating 
systems. 

The author has attempted to incorporate into the system all of 
his knowledge of the various errors that may result in program inter- 
ruptions on the computer in question, along with the many possible 
causes of these errors. Information of this type taken from a number 
of books and publications has also been included. However, the logic
contained in the programs comprising the error diagnostics system is 
by necessity incomplete. Other errors are sure to arise that are not 
covered by the program logic. In particular, among error conditions 
resulting from attempted input/output operations, logic related to only 
those caused by execution of the queued sequential access method (QSAM)
GET and PUT macros has been included. QSAM is by far the most commonly
used access method in assembler language programs (especially in a
student environment). Code covering the error conditions resulting from
use of the macros of other access methods might be a desirable future
enhancement. Since the system has been designed in a modular format,
logic covering any additional error types and/or causes may be easily
added at a later time.

The requirements for a programmer to use the system have been 
deliberately kept to a minimum. Although much of this thesis is writ- 
ten for the person with some knowledge of the details of the System/ 
360 - 370 Operating System and its interrupt structure, little back- 
ground is required to actually use the error diagnostics package. The 
reader with little or no experience is referred to the appendix where 
the first example given shows all the program and job control language 
statements necessary for the successful use of the system. 



CHAPTER I
User's Manual

General

The error diagnostics system is capable of trapping and handling 
any or all of the fifteen program interruptions (system completion codes 
0C1 - 0CF) produced by assembler language programs executing on the
IBM System/360 - 370 series of computers. The user determines which 
interrupt conditions are to be trapped and whether or not a system dump 
is to be produced along with the diagnostic output. Only one macro 
instruction within the user's program, and two or three extra job
control language (JCL) cards (described below) are required to use the
system. Since, at the user's option, a post-mortem system dump is also 
produced, any user may use the package without fear of not getting some 
information that would otherwise be produced using the standard system 
interface. Just as with the system dump, the use of the error diagnos- 
tics system results in the termination of the user's job; recovery from 
an interrupt condition is not possible. 

User Requirements 

The macro instruction should be included in the first control sec- 
tion (CSECT) in the task for which the error diagnostics system is to be 
in effect. The exit routine will then be given control when any of the 
specified program interruptions occurs in any program of the task. 

The macro instruction is written as follows: 

     SPIESET name,(interruptions)

where: 

name is the name to be given to a control section automatically
generated in the user's region by the expansion of the macro. It may be
1 - 6 characters in length and should be distinct from all other names
used in the user program. Three labels, name prefixed with SP, ST, and
SV, will also be generated - these too should be distinct.

interruptions is one or more decimal numbers, separated by commas,
indicating the corresponding interruption types shown below. The inter- 
ruption types can be designated in any order as follows: 

a) one or more single numbers, each indicating the corresponding 
program interruption type, or 

b) one or more pairs of decimal numbers, each pair indicating 
a range of corresponding interruption types. The second number must 
be higher than the first. The pairs of numbers must be separated 
from each other by commas and enclosed in an additional set of paren- 
theses. 

For example, a second operand of (4,8) indicates interruption 
types 4 and 8; ((4,8)) indicates interruption types 4 through 8, in- 
clusively. ((1,15)) indicates all interruption types 1 through 15; 
((3, 11), (13, 15)) indicates all interruption types except 1, 2, and 12. 
The interruption types are as follows: 

Number    Interruption Type

   1      Invalid Operation
   2      Privileged Operation
   3      Invalid EXECUTE
   4      Protection
   5      Addressing
   6      Specification
   7      Data
   8      Fixed-Point Overflow (maskable)
   9      Fixed-Point Divide
  10      Decimal Overflow (maskable)
  11      Decimal Divide
  12      Exponent Overflow
  13      Exponent Underflow (maskable)
  14      Significance (maskable)
  15      Floating-Point Divide

Care should be taken in the specification of the interruption types.
If a maskable interruption type is specified, the corresponding program
mask bit in the program status word is set to 1, thus allowing the
interruption to occur. If the programmer desires to mask any of these
conditions (i.e., not permit an interruption when the condition is present),
he should not specify the corresponding interruption number. Both 
operands of the SPIESET macro must always be coded; there are no default 
values. 
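
The operand rules above can be modeled in a few lines. The following is a minimal sketch in Python, not the macro's actual expansion logic: the function names `expand_interruptions` and `enabled_maskable` are hypothetical, and the operand is represented as a Python tuple rather than as assembler source text.

```python
# Illustrative model of the SPIESET second operand (hypothetical names).
# Maskable interruption types, taken from the table above.
MASKABLE = {8: "Fixed-Point Overflow", 10: "Decimal Overflow",
            13: "Exponent Underflow", 14: "Significance"}


def expand_interruptions(operand):
    """Expand an operand into a sorted list of interruption types.

    Each item is either a single number, as in (4, 8), or a
    (low, high) pair giving an inclusive range, as in ((4, 8),).
    """
    selected = set()
    for item in operand:
        if isinstance(item, tuple):
            low, high = item
            if not low < high:
                raise ValueError(f"range {item}: second number must be higher")
            numbers = range(low, high + 1)
        else:
            numbers = [item]
        for n in numbers:
            if not 1 <= n <= 15:
                raise ValueError(f"{n} is not an interruption type (1-15)")
            selected.add(n)
    return sorted(selected)


def enabled_maskable(types):
    """Return the maskable types whose program mask bit would be set to 1."""
    return [n for n in types if n in MASKABLE]
```

With this model, the operand (4,8) selects two types, ((4,8)) selects five, and only type 8 in either case would have its program mask bit turned on.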

The result of the macro call is the setting of a SPIE macro (which 
is included in the expansion of the SPIESET macro), and the production 
of a small control section in the user's region which, as a result of
the SPIE, is given control at time of interrupt and then links to the
error diagnosis routines. Although all the other routines comprising
the error diagnosis system are reentrant, the CSECT produced is not, in
order to minimize in-core code and execution time overhead. Since the
SPIESET macro is designed for use primarily in a "driver" program within
a testing environment, the fact that the generated control section is 
not reentrant should not be considered a significant drawback. However, 
the error diagnostics system should not be used by a program which will 
actually be run simultaneously by multiple users. 

If the user desires to change the interruption type specifications 
after issuing a SPIESET macro instruction, he may issue another SPIESET,
again specifying the name to be given to a generated control section and
the interruption types upon the occurrence of which the error diagnosis
system is to be given control. The control section name specified must 
follow the rules given above and must be distinct from the name(s)
specified in any previously issued SPIESET macro instruction(s), since
another control section will now be generated with the name as given. 
The interruption types specified in the newly issued SPIESET should be 
complete - they totally override any previous specifications, and the 
previously generated control section(s) are no longer given control
under any circumstances. 

To cancel the effect of a SPIESET macro instruction, a SPIE macro 
with no operands may be issued. After a "cancel" SPIE has been issued,
the standard control program exit routine is once again given control 
upon any program interrupt condition. 

Since the SPIESET macro is generally issued only in a main or 
"driver" program within a testing environment, it is usually not neces- 
sary to reestablish the effect of any previously-issued SPIE macro 
before returning control. However, the user should make sure that the 
program that issues the SPIESET macro instruction does not return con- 
trol to a calling program, or transfer control (by issuing an XCTL macro 
instruction) to another program that is not fully debugged, and for 
which the error diagnosis system is not to be in effect, before issuing
a "cancel" SPIE.



Necessary JCL Statements * 

In the JCL for the assembler step in which the macro instruction 
is included, the user should concatenate the FRA380.KIRSCH macro library 
with the system macro library (SYS1.MACLIB) and any other macro libraries
he might be using. This data set contains the definition of the SPIESET 
macro.

A JOBLIB or STEPLIB DD card specifying the FRA380.KIRSCH2 data set
must also be included in the user's JCL stream. It is on this data set 
that the object modules of the various termination analysis routines 
are kept. If a STEPLIB DD card is used, it should be included in the 
JCL for the user program's execution (GO) step. 

A SYSUDUMP or SYSABEND DD card should be included in the JCL state- 
ments for the user program's execution step if a system dump is desired 
in addition to the error diagnosis output. If no SYSUDUMP or SYSABEND 
DD card is provided, no dump will be produced. 

The error diagnostics system writes its output on a data set whose 
DDNAME is SYSPRINT. The user must thus provide a SYSPRINT DD card in 
the execution step directing the error diagnosis output to the desired 
device. Normally the output would be directed to the printer; in this 
case SYSOUT=A should be coded. Note that SYSPRINT is the DDNAME commonly 
assigned to the normal program printer output. This has been designed
so that in many cases an additional JCL card will not be required, and 
the error diagnosis output will follow directly any output the user 
program may have produced before its termination. 

* The data set names given in this section are those in use at the 
Instruction and Research Computer Center at The Ohio State University 
as of May, 1974. It is expected that these data sets will continue 
to be available. 
A sample deck of a user job including assembler and loader (execute)
steps, employing the ASMGRUN catalogued procedure, is given below. All 
the necessary JCL cards are included: 

Notes

        // Job Card
1)      //JOBLIB DD DSN=FRA380.KIRSCH2,DISP=SHR
        //STEP1 EXEC ASMGRUN
        //CMP.SYSLIB DD
        // DD DSN=FRA380.KIRSCH,DISP=SHR
        //CMP.SYSIN DD *

           User program including SPIESET macro instruction

1)      //GO.STEPLIB DD DSN=FRA380.KIRSCH2,DISP=SHR
        //GO.SYSPRINT DD SYSOUT=A
2)      //GO.SYSUDUMP DD SYSOUT=A
3)      //

Notes: 1) Either of the JOBLIB or STEPLIB DD cards, but not both,
is required.

2) The SYSUDUMP DD card should be included only if a system 
dump, in addition to the error diagnosis output, is desired. 

3) Other DD cards may be needed in the GO step depending on
the individual user program requirements. 

The reader is referred to the first example in the appendix, where
an actual computer run is shown including all required user-supplied JCL
statements and program instructions necessary to use the diagnostics
system.

Output Description 

The output produced by the error diagnostics system is in two
parts. First is general information, common to all 15 interruption types.
In this section, output first is the message EXECUTION ERROR followed by
the system completion code (0C1-0CF, corresponding to program interruption
types 1-15, respectively) and a brief description of the interruption
type. Next the location of the interrupt is printed with the name
of the CSECT in which it occurred and its relative address, if it
occurred within the user's program area. If the interrupt location falls
outside this area, a note to that effect is printed. Next output are
the contents of the 16 general purpose registers and four floating-
point registers at time of interrupt in both hexadecimal and decimal
representations. This is followed by the hexadecimal representation
of the eight bytes of memory starting at the interrupt location, and a
reconstruction of the instruction causing the interruption when this
can be positively determined - i.e., for all interrupt types other
than 0C1 (invalid operation), 0C4 (protection), and 0C5 (addressing).

The second part of the output is diagnostic information keyed to 
the specific interruption type and its possible causes. This informa- 
tion includes a message explaining the cause of the interrupt in gene- 
ral (the messages produced for each interrupt type are described in 
Chapter III). This is followed by a detailed explanation of the precise
cause of the error, insofar as this can be determined, and, whenever
possible, debugging hints aimed at correcting the error condition.

In the case of an invalid operation interruption (0C1), reconstruction
of the instruction causing the exception is impossible, but
an error diagnosis is printed and debugging aids are provided. Due to
the instruction look-ahead feature of many models of the System/360 and
370 computers, protection and addressing interruptions (types 0C4 and
0C5) are imprecise and it is thus impossible to determine exactly the
instruction causing the interruption. In these cases the instructions
immediately preceding the interrupt location are reconstructed and
printed, those instructions capable of causing the interruption being
flagged. This procedure is explained in more detail in Chapter III.



CHAPTER II 
Program Logic Structure 

General

This chapter describes the structure of and the linkages between
the various modules that combine to make up the error diagnostics system. 

The first component of the system is the user program interface.
In order to utilize the package, a programmer must include a SPIESET 
macro instruction in the first control section of his job for which 
he desires the error diagnostics system to be in effect. This would 
generally (but need not) be the main or "driver" control section of 
the job. 

The SPIESET macro instruction expansion includes the following: 

1) the expansion of a SPIE (Set Program Interrupt Element) macro 
instruction, and 

2) the creation of a small control section in the user's program 

area. 

The SPIE macro instruction is used to specify an alternative exit 
routine to be given control when any of the specified program exceptions 
occur. This allows the user to bypass the processing of the standard 
control program exit routine which would otherwise be given control and 
abnormally terminate the task, producing a standard system dump. After 
the SPIESET macro instruction is executed, control is transferred instead
to the generated control section upon any specified program interruption. 
The generated CSECT then brings the major routine of the error diagnostics 
system into main storage and passes control to it by issuing a LINK macro. 

The expansion of the SPIE macro instruction results in the forma- 
tion of a program interruption control area (PICA) which contains the 
new program mask for the interruption types that can be disabled, and a 
code for the interruption types specified in the SPIESET macro. Also
included in the PICA is the address of the new CSECT (Figure 1). If the 
SPIESET macro instruction specifies an exception for which the interrup- 
tion has been disabled, the control program enables the interruption 
when the macro instruction is issued. 

     Displacement in bytes

     0                 1 - 3                    4 - 5
     0000 | Program    Exit Routine Address     Interruption Types
            Mask

     Figure 1. Program Interruption Control Area
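
The six-byte layout of Figure 1 can be illustrated by packing the fields by hand. This is only a sketch under stated assumptions: the name `build_pica` is hypothetical, the program mask bits are taken in PSW order (fixed-point overflow, decimal overflow, exponent underflow, significance), and the encoding of the Interruption Types halfword (bit n set for type n, bit 0 leftmost) is assumed, since the text gives only the field name.

```python
import struct

# Program mask bit for each maskable type, as a 4-bit value
# (assumed to follow the order of PSW bits 36-39).
PROGRAM_MASK_BIT = {8: 0b1000, 10: 0b0100, 13: 0b0010, 14: 0b0001}


def build_pica(exit_address, types):
    """Pack a 6-byte PICA image from an exit address and a list of types."""
    program_mask = 0
    interruption_bits = 0
    for n in types:
        program_mask |= PROGRAM_MASK_BIT.get(n, 0)
        # Assumed encoding: bit n of the halfword, counting bit 0 as leftmost.
        interruption_bits |= 1 << (15 - n)
    # Byte 0 holds 0000 plus the program mask; bytes 1-3 hold the address.
    first_word = (program_mask << 24) | (exit_address & 0xFFFFFF)
    return struct.pack(">IH", first_word, interruption_bits)
```

For types 4 and 8 with a hypothetical exit routine address of 012345, this model yields the bytes 08 01 23 45 08 80: the program mask nibble shows fixed-point overflow enabled, and two bits are set in the final halfword.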

Upon execution of the SPIESET, and thus the SPIE, macro, the control 
program creates a 32 byte program interruption element (PIE) in the main 
storage area assigned to the job step (Figure 2). 

     Displacement in bytes

            0           1           2           3
      0     Reserved    PICA Address
      4     Old Program Status Word
      8       (interruption code)
     12     Register 14
     16     Register 15
     20     Register 0
     24     Register 1
     28     Register 2
     32

     Figure 2. Program Interruption Element



The PICA Address in the program interruption element is the address
of the program interruption control area created by the expansion of the
SPIE macro. When control is passed to the generated control section
(designated in the PICA and hereinafter referred to as the exit routine)
the program status word and the contents of general purpose registers
14, 15, 0, 1, and 2 at time of interrupt are stored in the PIE by the
control program as indicated. The register contents are as follows
when the exit routine gains control:

Register 0: internal control program information 

Register 1: address of the program interruption element 

Registers 2-12: same as when the program interruption occurred 

Register 13: address of the save area for the user program 
causing the interruption 

Register 14: return address (to the control program) 

Register 15: address of the exit routine 

When control is passed to the exit routine it stores these register 
contents in a special 16 word save area and then executes a LINK macro 
passing control to the major control section of the error diagnostics 
system (INTERUPT). When INTERUPT returns control, the exit routine 
restores the registers from the special save area and then returns con- 
trol to the control program using the address passed in register 14. 
This does not result in a normal return to the user program, however, 
since the old program status word and some of the register fields of 
the PIE are changed by the error diagnostics system before returning 
control. This procedure is explained in more detail at the end of 
this chapter.

Description of INTERUPT

The INTERUPT control section forms the mainstay of the error diag- 
nostics system. Passed control by the user program's exit routine, it 
performs all the general error diagnosis functions, determines the inter- 
ruption type, and then passes control to the appropriate termination 
analysis routine. 

One of the functions of the INTERUPT control section is the printing 
of both the general purpose and the floating-point register contents at 
the time of the interruption. To this end, it creates two special regis- 
ter save areas. Since the contents of the floating-point registers have 
not been changed since the time of interruption, their contents are 
stored directly into the floating-point register save area. The con- 
tents of general purpose registers 3 - 13 at time of interrupt are 
moved into the special save area from the save area in the user pro- 
gram's exit routine. The contents of registers 14, 15, 0, 1, and 2 at 
time of interrupt are found in the program interruption element and 
moved into the special save area from there. 

Bits 28 - 31 of the PSW at time of interrupt (found in the PIE) 
give the number (1 - 15) of the program interruption which occurred.
The program interruption number is translated into the corresponding
system completion code (0C1 - 0CF) and is printed along with a general
description of the interruption type taken from a table in the routine. 
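
The translation just described is small enough to sketch directly. The Python below is illustrative only: `describe` is a hypothetical name, the description table is copied from Chapter I, and the 24-bit instruction address is assumed to occupy the last three bytes of the old PSW.

```python
# Descriptions copied from the interruption type table in Chapter I.
DESCRIPTIONS = {
    1: "Invalid Operation", 2: "Privileged Operation", 3: "Invalid EXECUTE",
    4: "Protection", 5: "Addressing", 6: "Specification", 7: "Data",
    8: "Fixed-Point Overflow", 9: "Fixed-Point Divide",
    10: "Decimal Overflow", 11: "Decimal Divide", 12: "Exponent Overflow",
    13: "Exponent Underflow", 14: "Significance",
    15: "Floating-Point Divide",
}


def describe(old_psw):
    """Return (completion code, description, address) for an 8-byte old PSW."""
    if len(old_psw) != 8:
        raise ValueError("the old PSW is a doubleword (8 bytes)")
    # Bits 28-31 are the low-order four bits of the fourth byte.
    number = old_psw[3] & 0x0F
    if number == 0:
        raise ValueError("no program interruption number present")
    # Assumed: the instruction address is the low-order 24 bits of the PSW.
    address = int.from_bytes(old_psw[5:8], "big")
    return f"0C{number:X}", DESCRIPTIONS[number], address
```

For a made-up old PSW of FF150007 40012A3C, the model reports completion code 0C7 (Data) with an instruction address of 012A3C.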

The location of the interruption is next printed with the name of 
the user program control section within which it occurs and its relative 
address within that CSECT. If the location of the interrupt falls
outside the user's program area, a message to that effect is printed
instead. In this case the name of the control section being executed at
the time of interrupt is printed in the specific error diagnosis section
of the output.

Next printed are the contents of the sixteen general purpose and 
four floating-point registers at the time of interruption (now stored 
in the two special register save areas previously mentioned). The 
register contents are printed out in both hexadecimal and decimal
representations. For the general purpose registers, the decimal
representation is in the form of simple signed integers; for the
floating-point registers the form is standard scientific or "E" notation - a
normalized decimal number multiplied by a power of ten.
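
Both conversions can be sketched briefly. The Python below is an illustration, not the conversion routine used by the system: the function names are hypothetical, and the floating-point case assumes the System/360 long format (a sign bit, a 7-bit excess-64 exponent of 16, and a 56-bit fraction).

```python
def general_register_decimal(raw):
    """Interpret a 4-byte general purpose register as a signed integer."""
    if len(raw) != 4:
        raise ValueError("a general purpose register is 4 bytes")
    return int.from_bytes(raw, "big", signed=True)


def hex_float_to_decimal(raw):
    """Convert an 8-byte long hexadecimal floating-point value to a float."""
    if len(raw) != 8:
        raise ValueError("a long floating-point register is 8 bytes")
    sign = -1.0 if raw[0] & 0x80 else 1.0
    exponent = (raw[0] & 0x7F) - 64          # excess-64 power of 16
    fraction = int.from_bytes(raw[1:], "big") / float(1 << 56)
    return sign * fraction * 16.0 ** exponent


def e_notation(value):
    """Format a value in "E" notation with six decimal places."""
    return f"{value:.6E}"
```

As a usage example, the register image 42640000 00000000 has an exponent of 2 and a fraction of 100/256, so the model prints it as 1.000000E+02.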

For all precise interruption types (all those except 0C4 and 0C5,
protection and addressing, respectively) the hexadecimal representation
of the eight bytes starting at the location of the interruption is next
printed. This is followed by a reconstruction of the instruction causing
the interrupt whenever this is possible (for all interruption types
other than 0C1, 0C4, and 0C5). This is accomplished by executing a LINK
macro passing control to routine DECODER which reconstructs the instruction
from its object code and then returns control to INTERUPT from
where the reconstructed instruction is printed. DECODER makes use of an
instruction list (INSTRLST) which includes the mnemonic and instruction
type for all System/370 instructions, as well as all the program
interruptions each instruction is capable of causing. The instruction list
is brought into main storage by means of a LOAD macro instruction executed
from the INTERUPT control section. It is used by the termination
analysis routines for the imprecise interruption types 0C4 and 0C5; for
all other interruption types its space is relinquished after DECODER
returns control, by means of a DELETE macro also executed from INTERUPT.
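
The idea of reconstructing an instruction from its object code with the help of an instruction list can be shown with a toy model. The sketch below is not the DECODER routine: the table holds only a handful of RR and RX instructions, its layout is invented for illustration, and the names `INSTRUCTION_LIST`, `instruction_length`, and `decode` are hypothetical.

```python
# Tiny stand-in for INSTRLST: opcode -> (mnemonic, instruction type).
INSTRUCTION_LIST = {
    0x18: ("LR", "RR"), 0x1A: ("AR", "RR"), 0x1D: ("DR", "RR"),
    0x50: ("ST", "RX"), 0x58: ("L", "RX"),
    0x5A: ("A", "RX"), 0x5D: ("D", "RX"),
}


def instruction_length(opcode):
    """Length in bytes, given by the two high-order bits of the opcode."""
    return (2, 4, 4, 6)[opcode >> 6]


def decode(code):
    """Reconstruct an assembler statement from RR or RX object code."""
    opcode = code[0]
    if opcode not in INSTRUCTION_LIST:
        raise KeyError(f"opcode {opcode:02X} is not in the sample list")
    mnemonic, kind = INSTRUCTION_LIST[opcode]
    r1, second = code[1] >> 4, code[1] & 0x0F
    if kind == "RR":
        return f"{mnemonic} {r1},{second}"          # R1,R2
    base = code[2] >> 4
    displacement = ((code[2] & 0x0F) << 8) | code[3]
    return f"{mnemonic} {r1},{displacement}({second},{base})"  # R1,D2(X2,B2)
```

For example, the object code 5A10C010 decodes in this model to A 1,16(0,12). A fuller table would also carry, for each instruction, the interruption types it can cause, as the text describes.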

This ends the functioning of the general error analysis section. 
The floating-point register contents are restored and control passes 
to one of the fifteen termination analysis routines, depending on the 
interruption type. Control is passed by means of a LINK macro. 

General Description of the Termination Analysis Routines

There are fifteen termination analysis routines, one correspon- 
ding to each of the program interruption types. Rather than being 
general in nature as is the INTERUPT control section, each of the ter- 
mination analysis routines is keyed to one specific interruption type and 
the possible causes for that error condition. They are brought into 
main storage only when needed and invoked by means of a LINK macro 
executed in the INTERUPT control section. Although each routine is a 
separate entity in itself, similarities exist among the functions
performed. This common functioning is the subject of this section.

When each routine receives control it prints out a short message 
generally describing the error condition. This message does not attempt 
to give the cause of the error - it is simply a description of the inter- 
ruption type. In the case where the interruption occurred outside the 
user's program area, this line is followed by a line giving the name of 
the control section being executed at the time the interruption occurred. 

Following this is a detailed analysis of the probable cause(s) for 
the error condition, based on the interruption type, the instruction 
causing the exception, its location, and other factors. This processing 
is detailed in Chapter III where the functioning of each of the fifteen
termination analysis routines is described. 

At the conclusion of each routine's detailed error analysis, control
is returned in such a way as to force the production of a system
dump reflecting the original program conditions at time of interruption. 
This is done by placing the address of the instruction causing the 
interrupt in the second half of the old program status word stored in 
the program interruption element. A SPIE macro is then issued speci- 
fying a PICA address of zero, thus cancelling the effect of the SPIESET 
macro. When control is returned the PSW is reloaded from the old PSW 
field of the PIE; this effects a return to the interrupt-causing
instruction and causes its subsequent attempted re-execution. When the
interruption occurs this time, control is not passed to the exit routine
(due to the effect of the "cancel" SPIE), but instead a system dump
is produced. This is the standard action taken. The system does not
allow for the resumption of the user program's execution from the point 
of interruption. 

In the case of imprecise interruptions (types 0C4 and 0C5) this 
procedure is modified somewhat. Since it is not always possible to 
determine the instruction causing the exception, it is similarly not 
always possible to force the interruption to recur. In this case the 
hexadecimal code for the ABEND SVC (0A0D) is placed at the interrupt
location. The contents of the old PSW field in the program interruption 
element are changed to reflect this address and the saved contents of 
register 1 in the PIE are altered to reflect the desired system comple- 
tion code. Control is still returned to the interrupt-causing location, 
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but now an ABEND macro is executed, again producing a system dump re- 
flecting in almost all respects the program conditions at the time of 
the actual interruption. 

This is also done in the case of those interruptions for which
the operation causing the exception is not suppressed (types 0C7, 0C8,
0CA, 0CC, 0CD, and 0CE). This is necessitated by the possible
elimination of the interrupt-causing condition by the total or partial
completion of the operation. The programmer is notified when this
special procedure is used by a message included at the end of the error
diagnosis output.
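
The choice between the two dump-forcing procedures described above can
be summarized in a short sketch. Python is used here only for
illustration, since the system itself is assembler code. The names
ABEND_SVC and redrive_plan and the dictionary layout are hypothetical;
only the completion codes and the 0A0D byte pair come from the text, and
the exact format of register 1 is not given there, so it is simplified.

```python
# Illustrative model only: the real system edits the PIE in assembler.
ABEND_SVC = bytes([0x0A, 0x0D])   # SVC 13 (ABEND), the 0A0D code in the text

IMPRECISE = {0x0C4, 0x0C5}        # failing instruction not always identifiable
NOT_SUPPRESSED = {0x0C7, 0x0C8, 0x0CA, 0x0CC, 0x0CD, 0x0CE}  # may complete


def redrive_plan(code, interrupt_addr, failing_instr_addr=None):
    """Describe how a system dump would be forced for a completion code."""
    if code in IMPRECISE or code in NOT_SUPPRESSED:
        # Re-execution cannot be relied on: plant an ABEND SVC instead.
        return {
            "method": "abend_svc",
            "patch_at": interrupt_addr,
            "patch_bytes": ABEND_SVC,
            "old_psw_address": interrupt_addr,
            "pie_register_1": code,   # simplified: completion code only
            "notify_programmer": True,
        }
    # Standard case: cancel the SPIE and let the instruction fail again.
    return {
        "method": "reexecute",
        "old_psw_address": failing_instr_addr,
        "cancel_spie": True,
        "notify_programmer": False,
    }
```

For example, redrive_plan(0x0C1, 0x5000, 0x4FFC) selects re-execution,
while redrive_plan(0x0C4, 0x5000) selects the ABEND SVC patch.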



CHAPTER III 
Termination Analysis Routine Logic

General

This chapter details the logic of the 15 termination analysis
routines corresponding to the 15 program interruption types handled by
the error diagnostics system (system completion codes 0C1 - 0CF). The
routines are named 0C1 - 0CF, each name reflecting the interruption type
handled by the program. There is also an 0C45 routine which performs
the instruction traces required by the 0C4 and 0C5 error diagnosis
programs.

Although actually a description of program logic, this chapter is
written so that it could be used by itself as a non-automated assembler
program debugging manual.

In each case, the standard error message which is printed out is 
given first, followed by a breakdown of the logic used to arrive at the 
possible causes of the interrupt condition.

0C1 - Invalid Operation

The operation code in the instruction causing the interrupt is in- 
valid - no such operation exists on this computer. 

Check the location of the interrupt (PSW at entry to ABEND): 
1) If the interrupt location is within the user program area, the 
cause of the exception is most probably an attempt to execute data as an 
instruction. A previous move instruction may have placed data in the 
program area, or the contents of the program base register may have been 
destroyed. 

The ABEND PSW will point to the next instruction to be executed
in the program.

2) If the interrupt location is 50, 52, 5002, or 5200 the most 
probable cause of the interrupt condition is an attempt to access an 
unopened or improperly opened data set.

If a GET was attempted, the interrupt location will equal 5002 or 
5200; if a PUT was attempted, the interrupt location will equal 50 or 
52. In either case, the contents of the following registers at time of 
interrupt should be investigated: 

Register 1 points to the Data Control Block (DCB) of the data set
involved;

Register 1 + 40 (decimal) points to its DDNAME; 
Register 14 points to the next instruction to be executed in the 
user program. 

3) If the interrupt location is outside the user program area and 
is not one of the above noted locations, the most probable cause of the 
error condition is a branch to a data area resulting in the fetching of 
data as the operation code for an instruction. A possible cause of this 
condition is the alteration of the program's base register contents. 
Check the contents of register 14 at entry to ABEND for (possibly, but
not always) the next instruction to be executed in the program or the 
instruction following the last BALR branch to a subroutine. 
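
The three-way check on the interrupt location can be sketched as a small
function. This is a Python illustration, not the assembler logic of the
routine itself. The function name and its parameters are hypothetical,
and the locations 50, 52, 5002, and 5200 are assumed here to be
hexadecimal addresses, which the text does not state explicitly.

```python
# Hypothetical helper; the special locations are assumed to be hexadecimal.
GET_LOCATIONS = {0x5002, 0x5200}
PUT_LOCATIONS = {0x50, 0x52}


def classify_0c1(interrupt_addr, program_start, program_end):
    """Map an 0C1 interrupt location to the most probable cause."""
    if program_start <= interrupt_addr < program_end:
        return "data executed as an instruction inside the user program"
    if interrupt_addr in GET_LOCATIONS:
        return "GET issued against an unopened or improperly opened data set"
    if interrupt_addr in PUT_LOCATIONS:
        return "PUT issued against an unopened or improperly opened data set"
    return "branch to a data area outside the user program"
```

In the GET and PUT cases the caller would go on to inspect registers 1
and 14 as described in item 2.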

0C2 - Privileged Operation 

The instruction causing the interrupt is valid only in the super- 
visor state - it cannot be executed by the user program. 

Check the location of the interrupt (PSW at entry to ABEND): 

1) If the interrupt location is within the user program area, the
cause is most probably an attempt by the problem program to issue a
privileged instruction.

A less probable cause would be an attempt to execute data as an 
instruction (an 0C1 interrupt would much more commonly result from this
condition). This might be the result of a previous move instruction 
placing data in the program area, or the alteration of the program's 
base register contents. 

In either case, the ABEND PSW will point to the next instruction to 
be executed in the user program. 

2) If the interrupt location is outside the user program area, the 
most probable cause of the error condition is a branch to a data area 
resulting in the fetching of data as the operation code for an instruc-
tion (although an 0C1 would, again, much more commonly result from this
condition). This might be the indirect result of the alteration of the 
user program's base register contents. Check the contents of register 
14 at entry to ABEND for (possibly, but not always) the next instruction 
to be executed in the program or the instruction following the last 
BALR branch to a subroutine. 

0C3 - Invalid EXecute 

The subject of the EXecute instruction causing the interrupt is 
another EXecute instruction. 

The instruction causing the interruption is the first EXecute
instruction. It, in turn, points to the subject EXecute instruction.

0C4 - Protection

The user program has attempted to change the contents of a protected 
area of main storage. 

Possible causes of this error are:

1) If register 15 at time of interrupt contains 02005000 or 
02005200, the cause is most probably an attempt to execute a GET macro 
referring to an unopened or improperly opened data set. 

2) If register 15 at time of interrupt contains 02000050 or 
02000052, the cause is most probably an attempt to execute a PUT macro 
referring to an unopened or improperly opened data set. 

In either case, register 14 at time of interrupt will point to the 
next instruction to be executed in the user program. 

3) The error might also result from an uninitialized index, or an 
attempt to index outside the program's assigned limits. This condition 
might be indirectly caused by the alteration of the user program's base
register contents. 

If the interrupt location is outside the user program area, the name 
of the CSECT being executed at time of interrupt is printed. 

Due to the instruction look-ahead feature of many models of the 
System/360 - 370 computer series, protection interruptions are imprecise. 
If the interrupt location is within the user program area, control is
passed to the 0C45 routine which prints out the instructions immediately
preceding the interrupt location, flagging those that could have caused
the interruption. Only those instructions that attempt to store data in
main storage can cause a protection interruption.

0C5 - Addressing

An address of a specified instruction or data is outside the limits 
of the computer's available storage.

Possible causes of this error are the specification of an invalid
data address, an invalid index, or an attempt to index outside the
program's assigned limits. Any of these causes might be the indirect
result of the alteration of the user program's base register contents.

If the interrupt location is outside the user program area, the 
name of the CSECT being executed at time of interrupt is printed. 

Due to the instruction look-ahead feature of many models of the
System/360 - 370 computer series, addressing interruptions are
imprecise. If the interrupt location is within the user program area,
control is passed to the 0C45 routine which prints out the instructions
immediately preceding the interrupt location, flagging those that could
have caused the interruption. Any instruction that accesses main storage
in any way can cause an addressing interruption.

0C45 - Traces Instructions for 0C4 and 0C5 Interruptions

Due to the instruction look-ahead feature of many models of the IBM 
System/360 - 370 computer series, addressing and protection interruptions 
are imprecise, and it is thus often impossible to determine the exact in- 
struction causing the interrupt condition. If the location of the inter- 
rupt occurs inside the user program area, both the 0C4 and 0C5 termina- 
tion analysis routines pass control to the 0C45 module, whose function 
is to reconstruct (using the DECODER routine) and print out the instruc- 
tions immediately preceding the interrupt location. 

Starting 36 bytes back from the interrupt location, the routine
attempts to decode all the object code down to and including the inter- 
rupt location. If the attempt fails at some point (i.e., if there is 
object code not corresponding to a valid instruction), the process be- 
gins again, starting 34 bytes back from the interrupt location. If 
this attempt fails, the process begins again at 32 bytes, 30 bytes, 
etc., until all the object code down to and including the interrupt 
location can be successfully translated to valid IBM System/370 assem- 
bler instructions. All the reconstructed instructions are printed along 
with their object code and absolute and relative memory addresses. 
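
The back-off procedure can be sketched as follows. This is a Python
illustration under stated assumptions, not the DECODER routine itself.
The opcode table is a deliberately tiny hypothetical subset, and the
function names are invented for the example. The instruction length
rule (taken from the two high-order bits of the operation code) is the
standard System/360 convention.

```python
# Hypothetical opcode table: a real decoder would list every valid opcode.
VALID_OPCODES = {0x05: "BALR", 0x18: "LR", 0x1A: "AR", 0x41: "LA",
                 0x50: "ST", 0x58: "L", 0x92: "MVI", 0xD2: "MVC"}


def instr_length(opcode):
    """System/360 instruction length from the two high-order opcode bits."""
    return (2, 4, 4, 6)[opcode >> 6]


def decode_from(memory, start, interrupt_loc):
    """Decode from start through the instruction at interrupt_loc, or None."""
    decoded, pos = [], start
    while pos <= interrupt_loc:
        opcode = memory[pos]
        if opcode not in VALID_OPCODES:
            return None                  # object code is not an instruction
        length = instr_length(opcode)
        decoded.append((pos, VALID_OPCODES[opcode],
                        bytes(memory[pos:pos + length])))
        pos += length
    # The walk must land on the interrupt location, not straddle it.
    if decoded[-1][0] != interrupt_loc:
        return None
    return decoded


def trace_back(memory, interrupt_loc, window=36):
    """Retry at 36, 34, 32, ... bytes back until a clean decode is found."""
    for back in range(window, -1, -2):
        start = interrupt_loc - back
        if start < 0:
            continue
        result = decode_from(memory, start, interrupt_loc)
        if result is not None:
            return result
    return []
```

Each retry moves the starting point forward by two bytes because every
System/360 instruction begins on a halfword boundary.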

The routine also flags all those instructions that possibly could 
have caused the interrupt condition. It does this by comparing the 
list of program interruptions each instruction is capable of causing 
(coded along with the mnemonic and instruction type code in the instruc- 
tion list) with the interrupt condition that has occurred. If a match 
is found the instruction is flagged. This is designed to aid the de- 
bugging process by limiting the number of instructions a programmer need 
consider in determining the reasons for his program's abnormal ter- 
mination. In some instances only one instruction will be flagged, most 
probably indicating precisely the interrupt-causing instruction. 
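
The flagging step amounts to a set-membership test, sketched here in
Python. The table below is an illustrative subset assumed for the
example (the actual instruction list is not reproduced in this chapter),
and the names CAN_CAUSE and flag_suspects are hypothetical. Following
the 0C4 description, only storing instructions are marked as able to
cause a protection interruption.

```python
# Illustrative subset only; the real instruction list covers every opcode.
CAN_CAUSE = {
    "ST":  {"protection", "addressing", "specification"},
    "MVC": {"protection", "addressing"},
    "L":   {"addressing", "specification"},
    "AR":  {"fixed-point overflow"},
    "LR":  set(),
}


def flag_suspects(mnemonics, interruption):
    """Pair each decoded mnemonic with True if it could have caused the
    interruption that occurred."""
    return [(m, interruption in CAN_CAUSE.get(m, set())) for m in mnemonics]
```

With the decoded sequence LR, L, ST and a protection interruption, only
ST is flagged; with an addressing interruption both L and ST are.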

It should be noted that only those instructions physically preceding 
the interrupt location are decoded and printed. It is possible that a 
branch might have been executed to the region of the interrupt location 
after the interrupt-causing instruction was issued, but before the in- 
terruption condition was detected. This circumstance would invalidate 
the chain of instructions printed, but it is impossible to
programmatically determine if this has, indeed, occurred. The user
should consider this possibility before relying on the diagnostic data
provided.

0C6 - Specification 

An operand specification in the instruction causing the interrup- 
tion is incorrect. 

Possible causes of this error are: 

1) An odd register operand address is specified in an instruction
requiring an even register operand address (i.e., D, DR (first
operand), M, MR (first operand), and the double shift operations).

2) An invalid floating-point register address (1, 3, 5, 7-15) 
is specified. Only 0 or 4 can be specified for an extended operand. 

3) A branch has been attempted to an odd address. The ABEND 
PSW points to the branch destination. 

4) The length of the second operand of a MP or DP instruction 
is greater than 8. 

5) The first-operand field is shorter than or equal to the 
second-operand field in a MP or DP instruction. 

6) An instruction address does not designate a location on an 
even-byte boundary. 

7) The block address in an SSK or ISK instruction does not have 
zeroes in the four low-order bit positions. 

8) An operand address does not designate an integral boundary in
an instruction requiring such integral boundary designation. This
condition cannot occur on System/370 computers for any instructions other
than Compare and Swap (CS) and Compare Double and Swap (CDS) due to the
byte-oriented operand feature.

9) A PUT or GET macro instruction has been attempted referring to 
an unopened or improperly opened data set, or a data set with a DDNAME 
whose corresponding DD JCL card is missing or misspelled. All system
indicators upon either of these two conditions are identical, and all
addresses and register contents at time of interrupt are the same as
those for the 0C1 interrupts caused by the same conditions.
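
Several of the causes listed above are simple arithmetic tests on the
operands, and can be sketched as follows. This Python fragment covers
only causes 1, 3, 4, and 5; the function name, its keyword parameters,
and the message strings are hypothetical and chosen for the example.

```python
# Instructions whose first operand must name an even register.
EVEN_REGISTER_OPS = {"D", "DR", "M", "MR", "SLDA", "SRDA", "SLDL", "SRDL"}


def spec_problems(mnemonic, r1=None, len1=None, len2=None,
                  branch_target=None):
    """Return the specification problems found for one instruction.

    r1 is the first-operand register number; len1 and len2 are operand
    lengths in bytes (used for MP and DP only).
    """
    problems = []
    if mnemonic in EVEN_REGISTER_OPS and r1 is not None and r1 % 2:
        problems.append("odd register where an even register is required")
    if mnemonic in {"MP", "DP"} and len1 is not None and len2 is not None:
        if len2 > 8:
            problems.append("second operand longer than 8 bytes")
        if len1 <= len2:
            problems.append("first operand not longer than second operand")
    if branch_target is not None and branch_target % 2:
        problems.append("branch to an odd address")
    return problems
```

An empty list means that none of these four causes applies, in which
case the remaining causes in the list should be considered.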

0C7 - Data 

Data in a field is of incorrect format for the instruction attemp- 
ting to process it. 

Possible causes of this error are: 

1) The sign and/or digit codes of operand(s) used by a CVB, AP, 
SP, ZAP, CP, MP, DP, ED, or EDMK instruction are invalid, i.e., not in 
the packed decimal format. 

2) The operand fields in an AP, CP, DP, MP, or SP instruction
overlap in a way other than with coincident rightmost bytes; or operand
fields in a ZAP instruction overlap, and the rightmost byte of the second
operand is to the right of the rightmost byte of the first operand.

3) The first operand of a MP instruction has too few high-order 
zeros. 

Any of these three error causes may result from: 

a) an uninitialized data field (e.g., blanks might have been read 
into a field designed to be processed with packed decimal instructions), 
or 

b) an incorrect or uninitialized index, resulting in invalid data
being referenced.

A data exception occurs during the execution of an instruction, and
the operation is suspended at that point; the contents of the result
field are unpredictable. An exception to this rule is that, except for
ED and EDMK instructions, the operation is suppressed when a sign code
is invalid, regardless of whether any other condition causing the
exception is present.
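
The format test behind cause 1 can be illustrated with a short Python
check of a packed decimal field: every half-byte except the last must be
a digit code (0-9), and the last half-byte must be a sign code (A-F).
The function name is hypothetical and the sketch ignores the overlap
rules of cause 2.

```python
def packed_decimal_errors(field):
    """List the format problems found in a packed decimal field (bytes)."""
    if not field:
        return ["empty field"]
    nibbles = []
    for byte in field:
        nibbles += [byte >> 4, byte & 0x0F]
    *digits, sign = nibbles           # rightmost half-byte holds the sign
    errors = []
    if any(d > 9 for d in digits):
        errors.append("invalid digit code")
    if sign < 0xA:
        errors.append("invalid sign code")
    return errors
```

A field of EBCDIC blanks (hexadecimal 40 in every byte), as in the
uninitialized-field example above, fails with an invalid sign code
because its rightmost half-byte is 0.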

0C8 - Fixed-Point Overflow

Two distinct causes of this error condition are possible: 

1) A carry has occurred out of the high-order bit position of the 
result register in a fixed-point arithmetic operation - the result of 
the operation causing the interrupt is too large to be expressed in 32 
bits in the 2's complement form.

2) High-order significant bits have been lost during an algebraic 
left shift operation. 

In either case, the operation is completed with the result left in
the register too large or too small by an even multiple of 2**31. The
interruption could be suppressed by setting PSW bit 36 to a zero. This
would cause the overflow condition to be ignored.
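
The first cause can be illustrated by modelling a 32-bit addition in
Python. The function name is hypothetical. For a single addition the
value left in the register differs from the true sum by 2**32, which is
twice 2**31 and so agrees with the statement above.

```python
def add_register(a, b):
    """Model a 32-bit two's complement add; return (result, overflow)."""
    true_sum = a + b
    # Keep only the low-order 32 bits, interpreted as a signed value.
    wrapped = (true_sum + 2**31) % 2**32 - 2**31
    return wrapped, wrapped != true_sum
```

Adding 1 to the largest positive value, 2**31 - 1, leaves -2**31 in the
register and signals the overflow.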

0C9 - Fixed-Point Divide 

Two distinct causes of this error condition are possible: 
1) The program has attempted a binary integer division by zero,
or the development of a quotient too large to be expressed in 32 bits
in the 2's complement form (caused by either a D or DR instruction).
The operation has been suppressed.

2) The result of a CVB instruction is too large to be expressed
in 32 bits in the 2's complement form. The operation has been completed 
by ignoring the high-order bits that cannot be placed in the register. 
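
Both causes reduce to a range test on a 32-bit signed result, sketched
here in Python. The function names are hypothetical, and the sketch
assumes that a quotient is acceptable whenever it lies within the 32-bit
two's complement range.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def divide_exception(dividend, divisor):
    """True if dividing a 64-bit dividend by a 32-bit divisor (D or DR)
    would raise a fixed-point divide exception."""
    if divisor == 0:
        return True
    quotient = abs(dividend) // abs(divisor)     # truncate toward zero
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    return not INT32_MIN <= quotient <= INT32_MAX


def cvb_exception(value):
    """True if a converted decimal value does not fit in 32 bits (CVB)."""
    return not INT32_MIN <= value <= INT32_MAX
```

A common source of the first cause is a dividend register pair whose
even register was never cleared or sign-extended, which makes the 64-bit
dividend far larger than intended.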

0CA - Decimal Overflow

The destination field in the decimal operation AP, SP, or ZAP is 
too small to contain the result, forcing one or more significant high- 
order digits to be lost. 

The result has been truncated on the left but lower-order digits 
and the sign are exactly as they would be in a longer field sufficient 
to hold the result. 

The interrupt could be suppressed by setting PSW bit 37 to a zero. 
This would cause the overflow condition to be ignored. 
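
The truncation can be modelled in a few lines of Python. The function
name is hypothetical. A packed field of n bytes holds 2n - 1 digits
beside the sign, so any result with more digits loses its high-order
digits.

```python
def store_decimal(value, field_bytes):
    """Return (stored_value, overflow) for a packed decimal field of
    field_bytes bytes."""
    capacity = 10 ** (2 * field_bytes - 1)   # digits available beside the sign
    magnitude = abs(value)
    stored = magnitude % capacity            # high-order digits are dropped
    sign = -1 if value < 0 else 1
    return sign * stored, magnitude >= capacity
```

Storing 123456 in a two-byte field (three digits) leaves 456 and signals
the overflow; the low-order digits and sign are unaffected.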

0CB - Decimal Divide

The quotient formed by the decimal division (DP) instruction causing 
the interruption exceeds the specified data field size. This might be 
the result of an attempted division by zero. 

0CC - Exponent Overflow

The result of the floating-point operation causing the interruption
is 16**64 or greater - the result characteristic exceeds 127 (hexadeci- 
mal 7F) and the result fraction is non-zero. 

The overflow can occur as the result of a carry-out of the high-
order fraction position during normalized or unnormalized addition or
subtraction and the following characteristic adjustment after a right
shift, or during the characteristic computations in multiplication or
division.

The operation has been completed. The fraction is normalized, and 
the sign and fraction of the result remain correct. The result charac- 
teristic has been made 128 smaller than the correct characteristic. 

0CD - Exponent Underflow

The result of the floating-point operation causing the interruption 
is smaller than 16**-64 - the result characteristic is less than zero and 
the result fraction is not zero. 

This condition could occur as the result of normalization during 
normalized addition or subtraction. It might also result from
multiplication, division, or halving.

The operation has been completed with a result whose characteristic 
is 128 larger than the correct characteristic. The fraction is norma- 
lized, and the sign and fraction remain correct. 

The interrupt could be suppressed by setting PSW bit 38 to a zero.
This would cause the operation to be completed by replacing the result
with a true zero.
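
The characteristic adjustments described for exponent overflow and
exponent underflow follow from the 7-bit, excess-64 characteristic
field, as this Python sketch shows. The function name is hypothetical,
and true_exponent denotes the power of 16 that the correct result would
require.

```python
def stored_characteristic(true_exponent):
    """Return (7-bit characteristic, condition) for a power-of-16 exponent."""
    correct = true_exponent + 64             # excess-64 notation
    if correct > 127:
        return correct - 128, "exponent overflow"    # 128 smaller
    if correct < 0:
        return correct + 128, "exponent underflow"   # 128 larger
    return correct, None
```

Adding or subtracting 128 is simply what remains when the correct
characteristic is forced into a field that can only hold 0 through 127.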

0CE - Significance

The result fraction of a floating-point addition or subtraction
(normalized or unnormalized) is zero - all significant digits of the
result have been lost.

The operation has been completed without further change to the 
characteristic and sign of the result. 

The interrupt could be suppressed by setting PSW bit 39 to a zero. 
This would cause the operation to be completed by replacing the result
with a true zero.
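
The four maskable interruption types and their PSW bits (36 through 39,
as given in the descriptions of the fixed-point overflow, decimal
overflow, exponent underflow, and significance exceptions) can be
collected into one lookup. This Python sketch treats the PSW as a 64-bit
integer with bit 0 as the leftmost bit; the names are hypothetical.

```python
# Completion code -> PSW program mask bit, as stated in the text.
MASK_BITS = {0x0C8: 36, 0x0CA: 37, 0x0CD: 38, 0x0CE: 39}


def interruption_enabled(psw, code):
    """True if the given interruption type can occur under this PSW."""
    bit = MASK_BITS.get(code)
    if bit is None:
        return True                  # not controlled by the program mask
    return bool((psw >> (63 - bit)) & 1)
```

A PSW with only bit 36 set therefore permits fixed-point overflow
interruptions while the other three maskable types are ignored.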

0CF - Floating-Point Divide

A floating-point division by a number with a zero fraction has been 
attempted. The operation has been suppressed with the dividend left
unchanged. 



CHAPTER IV 



CONCLUSIONS AND DIRECTIONS 
FOR FUTURE WORK

Conclusions

The major goal of this thesis has been the development of an im- 
proved error diagnostics system for use in the post-mortem debugging of 
assembler language computer programs. That goal has largely been 
reached, and the system described herein represents a significant im- 
provement over the facilities previously available. 

The long range goal of debugging research should be the elimination
of, or at least a greatly reduced reliance on, system dumps for
post-mortem debugging analysis. While this objective is still to be
realized, this system hopefully represents an important first step in
this direction. For many error conditions, the diagnostic output
provided will be sufficient for the user to accurately determine the
source of his error. In those cases where it does not suffice, its use
in conjunction with the system dump (which is still available at the
programmer's option) should speed the debugging process.

While not being all inclusive, the system presently handles many
of the errors commonly made by beginning and intermediate assembler
language programmers, and is especially well-suited for use in a
student environment. As the system is enlarged to handle more
termination conditions, and to treat those it already handles in more
detail, it should find application in numerous other environments.

One looks forward to the day when systems such as this one will 
replace the system dump as the standard manufacturer-supplied post- 
mortem error diagnosis package. This will only come to pass when the 
error diagnoses provided yield as much information as do the present 
system dumps, but with far less extraneous data and in a much more
intelligible format. A great deal more work will need to be done before
this aim becomes reality. One may even visualize operating systems and
other software packages being designed in the future with the debugging 
function specifically in mind. Much more research is needed in this 
very practical area. 

Directions for Future Work 

The error diagnostics system as currently configured represents only 
a starting point towards the development of a truly comprehensive post- 
mortem debugging system for the IBM System/360 - 370 computer series. 

The present system is capable of trapping and handling only the
program interruption types 0C1 - 0CF. An obvious extension to the system
would be to include in it capabilities for handling all the other
abnormal termination error conditions. However, this would necessitate
revamping much of the system structure, as the SPIE macro allows for
handling only the fifteen program interruption types. Also, the system
as currently configured is not designed for handling input/output
related errors well, since buffers are not saved. This would have to be
changed in a future version of the system.

Less far-reaching and more immediately realizable extensions would
be to expand the system to cover more error conditions resulting in
program interruptions; this will in fact become necessary as error
conditions are discovered that the program is not designed to handle. In
particular, there has been no attempt to include logic covering error
conditions related to the improper use of the macros of any access
method other than QSAM, and even some of the less commonly used QSAM
configurations have not been fully tested. Inclusion of such features
would be extremely desirable.

Significant improvements might also be made to the way in which the
imprecise interruption types are handled. Through a process of
"instruction reversal" it might be possible to analyze each instruction
preceding the interrupt location to determine which instruction actually
did generate the invalid address causing the interruption, rather than
simply noting all the instructions that could have. However, this scheme
would only work in some cases and an instruction reversing procedure
would be extremely difficult to design and implement.

Many other improvements could also be made in the user-system inter- 
face. At present this interface is extremely simple but allows the user 
very few options. Possible enhancements include letting the user specify 
his own exit routine for handling some interruptions while allowing him 
to use the error diagnostics system for other error types. In the case 
of maskable interruption types especially, it might be desirable to let 
the user specify whether he desires his program to terminate or resume 
execution after the diagnostic information is printed. Both of these 
enhancements could be implemented through the use of optional parameters 
in the SPIESET macro. A reentrant version of the SPIESET macro might 
also be desirable, allowing the system to be used by a program actually 
being executed simultaneously by multiple users. 



APPENDIX 
SAMPLE PROGRAMS

Sample Program #1

The first sample program attempts to execute a PUT macro referring 
to an unopened data set. This results in an invalid operation (0C1)
interruption. The SPIESET macro is set to trap all fifteen program
interruptions. Also included in this example is a listing of all the 
job control language statements necessary to use the error diagnostics 
system. Following the diagnostic output are the first three pages of 
the standard system dump which is optionally produced.

Sample Program #2

The second sample program generates an imprecise protection (0C4) 
interruption by attempting a store into absolute location zero. The 
eleven instructions preceding and including the interrupt location are 
decoded and printed, with only the one instruction actually causing the
interruption being flagged. The first reconstructed instruction 
SU 5,2574(9,0) does not correspond to an actual program instruction, 
but is a valid reconstruction of data flags expanded as part of the 
SPIESET macro. There is no way to programmatically avoid this. The
succeeding ten instructions do correspond to actual program instructions. 
Following the diagnostic output are the first three pages of the slightly 
modified system dump produced at the programmer's option. In this 
example, the SPIESET macro is set to trap only the non-maskable program
interruptions. All the maskable interruption types are disabled.

Sample Program #3



The third sample program generates a fixed-point overflow (0C8) 
interruption during an Add Register operation. The SPIESET macro is 
set to trap only 0C8 error types, thus enabling the interruption.